Enterprise Database Systems
Big Data Analytics

Big Data Analytics: Spark for High-speed Big Data Analytics

Course Number: it_dlbdadj_02_enus

Lesson Objectives

  • discover the key concepts covered in this course
  • recognize how Spark offers an open-source, scalable, massively parallel, in-memory solution for analytics applications
  • outline the two main components of the Spark architecture: the Resilient Distributed Dataset (RDD) and the Directed Acyclic Graph (DAG)
  • describe how Spark provides business value to Uber
  • describe how Spark provides business value to Alibaba
  • describe how Spark provides business value to the healthcare industry
  • compare Spark and Hadoop and name their main differences with respect to ease of use, latency, security, and cost
  • specify in which scenarios and conditions Spark is a better choice than its alternatives
  • list the main features of Spark, such as loading behavior, file formats, parallelism, caching, and data skew
  • name the most important performance optimization techniques in Apache Spark, such as file format selection, level of parallelism, and API selection
  • name simple best practices when using Spark, such as starting small and resolving data skew
  • summarize the key concepts covered in this course

Overview/Description
Spark is an open-source, massively parallel, in-memory solution that allows you to run big data analytics pipelines at high speed. Use this course to learn how Apache Spark works and gain an understanding of its architecture. As you progress, investigate the industry examples of Uber and Alibaba to recognize how Spark can add business value to data across many industries. Moving along, compare the functionality of Spark and Hadoop across use cases, identifying when using Spark is most advantageous. Finally, explore fundamental Spark characteristics, optimization techniques, and best practices. When you've completed this course, you'll have a solid theoretical understanding of how and when to use Apache Spark for specific big data analytics tasks.

Prerequisites: none

Big Data Analytics: Techniques for Big Data Analytics

Course Number: it_dlbdadj_01_enus

Lesson Objectives

  • discover the key concepts covered in this course
  • describe the challenges in current data analytics models and system designs, such as scalability, consistency, reliability, efficiency, and maintainability
  • name and describe the role of the main layers of big data analytics, from the bottom all the way to the top
  • specify why unstructured data comes from various sources and describe how it moves from its origin to storage, where it is further analyzed and visualized
  • define the role of the data processing layer and specify how information captured in the previous layer is processed
  • define the role of the data storage layer using HDFS as an example of commonly used primary data storage
  • outline the main pillars and components of big data architecture
  • describe batch processing, its use cases, and common reasons for using it
  • outline how stream processing enables quick decision-making by creating actionable real-time insights
  • define the concept of Lambda architecture and outline its use cases
  • define the concept of Kappa architecture and outline its use cases
  • summarize the key concepts covered in this course

Overview/Description
Big data analytics provides a way to turn the vast amounts of data available in today's digital world into valuable insights. For this reason, big data analytics techniques have taken a central place in many businesses' IT infrastructure. These techniques comprise complex processes and multiple stack layers that allow you to transform raw data into visualizations that reveal trends or other phenomena. Use this course to explore the basic principles and techniques of big data analytics in a business context. Go through each step of data processing to fully comprehend the big data analytics pipeline. Furthermore, explore various use cases of big data analytics through real-world examples. When you're done with this course, you'll have a foundational understanding of some of the technologies behind big data and how they can improve business decisions.

Prerequisites: none
